1995-03-10
1. Release Notes for IRIX 5.2 patch SG0000226
This release note describes patch SG0000226 to IRIX 5.2. It
contains the following information:
+o Hardware platforms supported
+o List of the bugs which are fixed by this patch
+o Compatibility considerations
+o List of subsystems included in this patch
+o Installation instructions
1.1 Hardware platforms supported
This patch to IRIX 5.2 supports the following machine types:
+o Challenge and Onyx with R4400 processors
+o Crimson (4D/510)
+o PowerSeries (4D/120, 4D/2xx, 4D/3xx and 4D/4xx)
It will not install on any other type of machine.
1.2 Bugs fixed by patch SG0000226
This patch contains fixes for the following problems which
exist in IRIX 5.2 (bug numbers from SGI bug tracking system
are included for reference):
+o The NFS server implementation in IRIX did not return
any value in the microseconds field of the modify time
in the reply to a getattr call (rfe #251693).
+o When writing with the O_SYNC flag from a 5.2 client to
a full file partition across NFS, once an ENOSPC
error was returned, further writes while the file
remained open would fail, even if space had become
available on the remote file system (bug #251853).
+o Multiprocessor systems acting as NFS servers can crash
if multiple operations attempt to update the list of
exported NFS filesystems simultaneously (bug 141828).
+o When dbx is started on a user VME driver which maps in
VME bus memory via the mmap() system call, attempting
to print the contents of that memory using dbx crashes
the machine.
This was true for the Challenge/Onyx machines as well
as the older 4D series machines.
This bug is fixed in this patch release. It is now
possible for a user VME process to be run under dbx and
view the variables in VME memory using dbx primitives
like print. This allows users to examine the variables
as bytes, half-words, or words depending on the type of
VME memory.
Trying to print a structure whose size is greater than
the maximum access size supported by the VME board (in
the case of D16 it could be 2 bytes, for D32 it could
be 4 bytes, etc) still does NOT work. This is
primarily due to VME bus access being size sensitive. A
D16 board may only respond to 2 byte size access.
Trying any other size access could cause problems for
the board. As a result, when a user tries to print a
structure, the kernel does not know what size operation
is right for the VME address space in question. If the
size is more than 8 bytes, it returns an error. (bug
189318).
+o Lost clock interrupts on Power Series machines cause
time to drift under a heavy system load. This patch
provides a temporary workaround to the problem on
machines equipped with an IO3 board. Machines which do
not have an IO3 board installed will be unaffected by
this patch (bug 192233).
+o The df(1) command can return a negative number as the
count of blocks used on the /proc file system under
some conditions (bug 193935).
+o The IO4 serial driver (only applicable on Challenge and
Onyx) has been modified for enhanced serial performance
at sustained high baud rates. The interrupt priority of
the duarts has been increased to prevent long interrupt
masking under heavy loads from causing the duart
hardware to drop incoming characters. As a side effect,
it is now possible to program the duarts to interrupt
less frequently, reducing the cost in cpu usage of
heavy serial traffic. Users should see better
performance at a lower cost.
Users are advised, however, that since the duart now
interrupts at a very high priority, it is now possible
to bring the entire machine to a halt by flooding it
with serial traffic on a large number of ports at
maximum baud rate. The machine may reach a state where
it spends 100% of its time handling serial interrupts.
Note that only the master cpu is actually tied up in
this fashion, but other cpus may also be tied up
waiting for the master cpu to release a needed
resource.
The high bandwidth capability and decreased cost are
nullified if the duart_rsrv_duration timeout variable
is configured to 0 at kernel build time, or if this
value is reset at runtime with the SIOC_ITIMER ioctl
command (see serial(7)). A 0 value in this case
indicates that the user wishes the smallest possible
latency receiving characters, and all of the tricks to
improve high baud rate performance entail some latency,
so they cannot be used in this case. The user can
expect a significant performance penalty when setting
this timeout to 0 (bug 200377).
+o There is a bug in the gang scheduler that prevents
priority from being observed between gangs on the same
queue. Additionally, the batch gang queue can
improperly run gangs even though the gang queue has
valid work in it. (bug 200394).
+o There is a bug in the code that keeps track of the IP
multicast addresses that a host is accepting. Systems
which use IP multicasting occasionally have some
multicast addresses deleted when they are still in use
or continue to listen for multicast addresses that are
no longer in use (bug 201283).
+o Multiprocessor Challenge and Onyx machines running IRIX
5.2 can hang as a result of a software deadlock (bug
204252).
+o The IRIX Extent File System code in 5.2 has the
property that files which are open and have been
extended since the last time they were closed are
likely to be lost when the system crashes for any
reason. Changes have been made to the file system code
in the kernel and to the file system check utility
(fsck) to reduce significantly the amount of data that
is lost when the system crashes or loses power with
extended application files still open. Note that there
is still no guarantee that all writes done by
applications will be preserved across a system crash.
The file system buffers writes and commits the data to
disk asynchronously by design (bug 204253).
+o Extending a file by writing to it on an NFS mounted
file system was slower than it should have been because
of incorrect interactions between the NFS server code
and the file system on the server side (bug 204732).
+o MP protection was added around automounter updates to
/etc/mtab.
+o The automounter no longer attempts to mount over an
already mounted (root) child filesystem. The child
filesystem will be the root, if the client is unable to
mount the parent filesystem, due to permissions,
timeouts, etc. (bug 172695).
+o There is a race condition in the communication between
the kernel and the local lock manager that can cause
NFS to hang (bug 205438).
+o There is a race condition in NFS client handle
allocation that can cause NFS to hang under heavy loads
on the client (bug 205453).
+o Programs which request extremely large reads or writes
(> 100Kb) on character or block devices can make very
slow progress and degrade the performance of the system
(bug 205422).
+o The profiling clock was running continuously on all
processors, even when no profiling was in progress.
This bug affects Challenge and Onyx only (bug 206673).
+o Under certain loads, the system occasionally appears to
be idle for periods up to 90 seconds, even when there
are active jobs that should be running. This bug
affects Challenge and Onyx only (bugs 193082 and
207844).
+o Internet port numbers that can be automatically
assigned were limited to 5000. This has been increased
to 65535.
+o Several software deadlocks that can cause the system to
hang have been fixed (bug 208087).
+o The normal diagnostics run at system powerup leave some
error bits set in the hardware that were not being
completely cleared by the operating system at boot
time. This residual error state causes other hardware
errors to be misdiagnosed. The kernel boot code now
clears these error bits. This bug affects Challenge
and Onyx only (bug 209406).
+o There is an error in the system audit trail mechanism
that can cause the system to crash when handling
pathnames of certain formats (bug 212708).
+o There is an error in the gang scheduler that can, under
certain circumstances, cause a gang to starve if it has
a large number of processes associated with it (bug
214170).
+o Multiprocessing EVEREST systems failed to start all
processors for different software configurations. This
problem was most notable with a kernel linked for
debugging, and no symmon available at boot time.
Usually only the master CPU would boot, with all slaves
failing to start. The problem has also been seen on
non-debug versions of the kernel. The problem was
caused by a race condition between the master CPU and
the slaves during the early boot process. The
appropriate synchronization is implemented in this
patch (bug 214364).
+o Using a regular file in a file system as a
supplementary swap area can cause the system to crash
during heavy swapping (bug 214374).
+o The combination of heavy outbound network traffic using
large buffers (as is done by doing ftp puts, for
example) and heavy page aging by the virtual memory
system when free memory is low can cause a
multiprocessor system to hang in the page flipping code
(bug 216587).
+o The disk quotas facility did not work in the previous
IRIX 5.2 patches SG0000001 and SG0000022. This has
been fixed in this patch release.
+o There is a performance problem related to the creation
of sproc children that have local mappings that results
in excessive rfault rates for the child. This occurs
when the parent has the local mapping already in its
address space when sproc is called (bug 222221).
+o A particular POSIX conformance bug that caused the
updating of file access time for a read on a file on a
read only file system has been fixed. (bug 223286).
+o Another POSIX conformance bug has been fixed: an fcntl
call that duplicates a file descriptor enough times to
exceed the user's allowable number of open file
descriptors returned error EMFILE; it now returns
EINVAL (bug 223492).
+o Still another POSIX conformance bug that has been
closed in this patch is the fact that if a process
which handles SIGCONT is sleeping inside a system call
at an interruptible priority, the sending of a job
control stop signal will interrupt the system call,
returning -1 and setting errno to EINTR. (bug 223509).
+o Sproc processes that exec may carry with them the wrong
pages from the parent, potentially causing the execed
process to hang (bug 227235).
+o Data base systems using asynchronous I/O could corrupt
data. However, only SYBASE version 10 is known to
trigger this problem (bug 229896).
+o Support for 4MB secondary cache systems in both IRIX
and IO4 prom.
+o The IO4 prom image also incorporates new segment loader
software, and multiple versions of the IO4 software for
different architectures. In particular, this version of
the IO4 prom supports both IP19 and IP21 CPU boards.
Some SCSI initialization and timeout values were
changed to recover from the system attempting to boot
from a disk that is not "ready" yet.
+o The IO4prom did not allow booting from a SCSI disk that
was above address 7 on the SCSI bus (bug 240879).
+o A new feature was introduced in patchSG0000022 for
EVEREST systems only (Challenge and Onyx with R4400
processors). For a certain class of memory errors,
recovery is possible in software because the data lost
is no longer required. This feature was disabled by
default, but is now enabled in patchSG0000226. For
example, errors which occur when zeroing a new page for
a task may be safely ignored, since the previous data
on the page is no longer needed.
The kernel variable "ecc_recover_enable" enables and
disables this recovery feature. A value of 0 indicates
that recovery should not be attempted. A non-zero
value represents the number of seconds over which 32
error recovery attempts can be made. In general, a
value of 60 should be used to enable this feature.
This is the value that is now enabled by default.
+o A bug introduced by patch 33, in which the NOINTR
directive could be ignored on EVEREST machines, causing
problems with real-time latency, has been corrected
(bug 235061).
+o Power Series and Crimson systems with dual VME buses
did not support user mode access (/dev/vme) properly.
A16 mode on the second bus did not work, and A32 did
not work on either bus. These problems are corrected.
+o VME write error handling for Challenge/Onyx systems was
not taking care of corner cases where a VME write error
followed by a VME read error would cause the systems to
crash in certain situations (bug 231142).
+o Fixed a problem where stressing a Power Series system
with multiple ethernets (et0 and enp0) would cause the
network subsystem to hang. This also caused the SCSI
subsystem to hang (bug 188296).
+o System calls stat() or xstat() on a tty file can hang
in kernel mode, leaving the process unkillable (bug
230375).
+o Multiprocessor systems with R4000 cpus or R4400 cpus at
revision level 2.2 or less and which use loadable
drivers can crash due to a kernel segmentation
violation. Such systems with loadable drivers can
panic with the cause being an RMISS and the bad_addr
not matching the faulting pc. The workaround installed
in the kernel detects this particular path when it is
caused by an R4000 bug and allows the operation to be
retried, which is needed to correct the problem (Bug
#236338).
+o Changes were put into the kernel which allow the kernel
stack to be increased by an additional page for real-
time processes and otherwise on an as-needed basis.
This increases the reliability of the system by
eliminating scenarios in which the kernel stack might
overflow and panic the system, which occasionally arose
in systems making heavy use of remote file systems, for
example (Bug #240710).
+o Fixed a problem where a binary compiled on IRIX 4.0.5
that uses libc function getcwd() will fail on an IRIX
5.2 machine that has a raid filesystem when the binary
is run on the raid filesystem (bug 234992).
+o Under certain circumstances, mail could experience a
deadlock in accessing its lock file in an nfs-mounted
mail directory. This fix makes it possible for users
to have nfs-mounted mail directories (bug 228720).
+o Writing to a named pipe over nfs could cause a system
panic. This problem could occur running previous IRIX
5.2 patches patchSG0000022, patchSG0000030,
patchSG0000033 or patchSG0000047, all of which are
replaced by patchSG0000226.
+o The fuser command could cause a system panic when used
on a machine with heavy socket creation/deletion
activity (bug 209242).
+o IRIX 5.2 patchSG0000047 introduced a problem whereby an
EFast ethernet board would not be seen on the VME bus
of a PowerSeries (4D/120, 4D/2xx, 4D/3xx and 4D/4xx).
That problem is fixed in patchSG0000226, which replaces
patchSG0000047.
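
The ecc_recover_enable description above (a value of 0 disables
recovery; a non-zero value is the number of seconds over which 32
recovery attempts can be made) reads like a windowed attempt budget.
The Python sketch below illustrates that reading only. It is not the
kernel's implementation: the class name RecoveryBudget, the
fixed-window reset policy, and the injectable clock are all
assumptions made for illustration.

```python
import time


class RecoveryBudget:
    """Hypothetical model of ecc_recover_enable: allow at most
    max_attempts recoveries within each window of window_seconds.
    A window of 0 means recovery is never attempted."""

    def __init__(self, window_seconds, max_attempts=32, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_attempts = max_attempts
        self.clock = clock          # injectable so the model can be tested
        self.window_start = None    # start time of the current window
        self.used = 0               # attempts consumed in the current window

    def try_recover(self):
        """Return True if a recovery attempt is permitted right now."""
        if self.window_seconds == 0:
            return False            # feature disabled
        now = self.clock()
        # Assumed policy: start a fresh window once the old one has expired.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        if self.used >= self.max_attempts:
            return False            # budget for this window is exhausted
        self.used += 1
        return True


if __name__ == "__main__":
    budget = RecoveryBudget(window_seconds=60)   # the default noted above
    granted = sum(budget.try_recover() for _ in range(40))
    print("attempts granted in the first window:", granted)
```

In this model, with the default value of 60, a 33rd attempt inside one
60-second window is refused, and the budget is restored once the
window has elapsed.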
1.3 Compatibility considerations
This patch includes slight content changes to the following
system header files as a part of the fix to prevent any
possible kernel stack overflow:
+o "/usr/include/sys/param.h"
+o "/usr/include/sys/proc.h"
These changes were made in a way to minimize compatibility
concerns, but it is still possible that software sensitive
to the exact kernel proc struct may need to be rebuilt.
For this reason, sites using CASEVision/ClearCase must
rebuild the MFS (MVFS for ClearCase 2.0 users) after
installing patchSG0000226. As the root user, execute the
following instructions and then reboot. The CPUBOARD value
IPxx may be determined from the hinv(1M) command.
If you are running ClearCase 1.1.4:
    % su
    # setenv CPUBOARD IPxx
    # cd /var/sysgen/boot
    # make -f /var/sysgen/Makefile.kernio mfs_param.o
    # mv mfs.o mfs.o.old
    # ld -o mfs.o -r premfs.o mfs_param.o
    # /etc/autoconfig -f
If you are running ClearCase 2.0:
    % su
    # setenv CPUBOARD IPxx
    # cd /var/sysgen/boot
    # make -f /var/sysgen/Makefile.kernio mvfs_param.o
    # mv mvfs.o mvfs.o.old
    # ld -o mvfs.o -r premvfs.o mvfs_param.o
    # /etc/autoconfig -f
Note: If you remove patch 226 you need to perform these
same steps.
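
The two command sequences above differ only in the module name (mfs
for ClearCase 1.1.4, mvfs for ClearCase 2.0). As a convenience, the
Python sketch below generates the sequence for a given ClearCase
version and CPU board so it can be reviewed before being typed as
root. The function name rebuild_commands and the version-to-module
table are illustrative assumptions; the command strings themselves
are taken from the steps above. The sketch only prints commands and
does not run them.

```python
def rebuild_commands(clearcase_version, cpuboard):
    """Return the csh commands listed in these notes for rebuilding
    the ClearCase file system module. Hypothetical helper: it only
    builds strings and executes nothing."""
    # Module names as given in the notes: MFS for 1.1.4, MVFS for 2.0.
    modules = {"1.1.4": "mfs", "2.0": "mvfs"}
    if clearcase_version not in modules:
        raise ValueError("unsupported ClearCase version: %r" % clearcase_version)
    m = modules[clearcase_version]
    return [
        "setenv CPUBOARD %s" % cpuboard,
        "cd /var/sysgen/boot",
        "make -f /var/sysgen/Makefile.kernio %s_param.o" % m,
        "mv %s.o %s.o.old" % (m, m),
        "ld -o %s.o -r pre%s.o %s_param.o" % (m, m, m),
        "/etc/autoconfig -f",
    ]


if __name__ == "__main__":
    # IP19 is used here only as an example; take the real value from hinv(1M).
    for command in rebuild_commands("2.0", "IP19"):
        print("# " + command)
```

The system must still be rebooted after the last command, as stated
above.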
1.4 Subsystems included in patch SG0000226
This patch includes changes to the following IRIX 5.2
products: compiler_dev, eoe1, eoe2, nfs and dev. The
patchSG0000226 image contains the following subsystems:
+o patchSG0000226.compiler_dev_sw.dbx
+o patchSG0000226.dev_hdr.lib
+o patchSG0000226.eoe1_sw.quotas
+o patchSG0000226.eoe1_sw.unix
+o patchSG0000226.eoe2_sw.audit
+o patchSG0000226.eoe2_sw.kdebug
+o patchSG0000226.eoe2_sw.perf
+o patchSG0000226.nfs_sw.nfs
1.5 Installation instructions
This patch is only installable on systems running IRIX 5.2.
This patch requires installation in miniroot mode. To
perform the installation, take the system down and follow
the normal procedures for starting up the installation tool
from the supplied release media. It is recommended that you
select all the patch subsystems that correspond to software
already installed on the system.
This patch will install on systems running IRIX 5.2, or on
Challenge or Onyx systems with the 5.2-200MHz release
installed to support IP19 200MHz CPU boards. In the case of
installing on the 5.2-200MHz release, inst will note an
apparent version mismatch for the subsystem
patchSG0000226.eoe1_sw.unix, as noted by:
    k N patchSG0000226.eoe1_sw.unix @ 0 14665+ IRIX Execution Environment
For correct installation of patchSG0000226, it is necessary
to issue the following inst command:
    set neweroverride on
in order to force inst to install
patchSG0000226.eoe1_sw.unix.
One way in which software patches differ from full releases
and maintenance releases is that patches are reversible:
you can remove the patch and restore the installed software
to its state before the patch was applied. This is done by
using the versions command as superuser:
    versions remove patchSG0000226
Since this patch replaces some kernel object files, it is
necessary to rebuild the kernel image and reboot after
removing the patch:
    autoconfig
    reboot